On Combining Text and MeSH Searches to Improve the Retrieval of MEDLINE documents
Abstract
The MEDLINE database is the world's largest repository of biomedical abstracts. It remains a central information entry point for most biologists despite the growing availability of full-text articles on the WWW. Each article is manually annotated with MeSH terms to allow easy access. To improve retrieval, the MeSH fields of MEDLINE records have been used successfully in the past with pseudo-relevance feedback and MeSH query expansion. However, previous experiments often ignored the structure of the MeSH fields. This paper investigates the impact of the MEDLINE MeSH field structure on a method that combines text and MeSH searches over a large subset of the MEDLINE database. Robertson’s Offer Weight technique is used to generate MeSH queries. Our method is evaluated on the ad hoc task collection of the TREC 2005 Genomics Track, and our results show that this approach significantly improves retrieval performance.
KEYWORDS: biomedical information, ontology, pseudo-relevance feedback, combination of searches.
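As a rough illustration of how MeSH queries might be generated with Robertson's Offer Weight, the sketch below scores the MeSH terms found in the top-ranked documents of a text search and keeps the best ones. The abstract does not give the exact formulation used, so the formula here is the commonly cited form, OW = r × RW, where RW is the Robertson/Sparck Jones relevance weight. The function names, the parameter `k`, and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def offer_weights(feedback_docs, doc_freq, n_docs):
    """Score MeSH terms with the Offer Weight, OW = r * RW.

    feedback_docs: list of sets of MeSH terms, one set per pseudo-relevant document
    doc_freq:      dict mapping each MeSH term to its document frequency n
    n_docs:        collection size N
    (All names are hypothetical; this is not the paper's code.)
    """
    big_r = len(feedback_docs)  # R: number of pseudo-relevant documents
    # r: number of pseudo-relevant documents containing each term
    r_counts = Counter(term for doc in feedback_docs for term in set(doc))
    scores = {}
    for term, r in r_counts.items():
        n = doc_freq[term]
        # Robertson/Sparck Jones relevance weight with 0.5 smoothing
        rw = math.log(
            ((r + 0.5) * (n_docs - n - big_r + r + 0.5))
            / ((n - r + 0.5) * (big_r - r + 0.5))
        )
        scores[term] = r * rw
    return scores


def mesh_query(feedback_docs, doc_freq, n_docs, k=3):
    """Return the k MeSH terms with the highest Offer Weight."""
    scores = offer_weights(feedback_docs, doc_freq, n_docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy data (invented): MeSH annotations of the top 3 documents of a text search.
top_docs = [
    {"Humans", "Apoptosis"},
    {"Humans", "Apoptosis", "Mice"},
    {"Humans", "Neoplasms"},
]
frequencies = {"Humans": 800, "Apoptosis": 20, "Mice": 300, "Neoplasms": 100}
print(mesh_query(top_docs, frequencies, n_docs=1000, k=2))
```

In this toy collection a very frequent term such as "Humans" appears in every feedback document yet scores well below the rarer "Apoptosis", which is the behaviour that makes the weight useful for picking discriminating expansion terms.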
Similar resources
The MeSHSim package
MeSH (Medical Subject Headings) is a controlled vocabulary thesaurus maintained by the NLM (National Library of Medicine) to index MEDLINE documents. MeSH consists of a set of description terms organized in a hierarchical structure (called MeSH trees), where more general terms appear at nodes closer to the root and more specific terms appear at nodes closer to the leaves (Nelson et al., 2004). Each ...
Retrieval feedback in MEDLINE.
OBJECTIVE: To investigate a new approach for query expansion based on retrieval feedback. The first objective in this study was to examine alternative query-expansion methods within the same retrieval-feedback framework. The three alternatives proposed are: expansion on the MeSH query field alone, expansion on the free-text field alone, and expansion on both the MeSH and the free-text fields. Th...
The Role of Different Types of Homograph Context in Determining Similarity Between Documents
Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation and can thereby improve the effectiveness of the results retrieved. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts can be divided into different types, wit...
Query Translation by Text Categorization
We report on the development of a cross-language information retrieval system that translates user queries by categorizing them into terms listed in a controlled vocabulary. Unlike usual automatic text categorization systems, which rely on data-intensive models induced from large training data, our automatic text categorization tool applies data-independent classifiers: a vector-space...
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
Publication date: 2006